Metadata Management in Scientific Computing

Author

  • Eric L. Seidel

Abstract

Complex scientific codes and the datasets they generate need a sophisticated categorization environment that allows the community to store, search, and enhance metadata in an open, dynamic system. Currently, data is often presented in a read-only format, distilled and curated by a select group of researchers. We envision a more open and dynamic system, where authors can publish their data in a writeable format, allowing users to annotate the datasets with their own comments and data. This would enable the scientific community to collaborate on a higher level than before; researchers could, for example, annotate a published dataset with their citations. Such a system would require a complete set of permissions to ensure that any individual’s data cannot be altered by others unless they specifically allow it. Datasets and codes are generally presented read-only precisely to protect the author’s data; however, this also prevents the type of social revolutions that the private sector has seen with Facebook and Twitter. In this paper, we present an alternative method of publishing codes and datasets, based on Fluidinfo, which is an openly writeable and social metadata engine. We will use the specific example of the Einstein Toolkit, a shared scientific code built using the Cactus Framework, to illustrate how the code’s metadata may be published in writeable form via Fluidinfo.
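
The permission requirement described in the abstract can be illustrated with a toy model. The sketch below is not the Fluidinfo API and is not taken from the paper; the class `TagStore`, its methods, the users `alice` and `bob`, and the object name `dataset-42` are all hypothetical. It assumes that each tag lives in a namespace named after its owner, that anyone may read, and that only the owner (or users the owner has explicitly granted) may write there.

```python
from collections import defaultdict


class TagStore:
    """Toy in-memory metadata store (hypothetical; not the Fluidinfo API)."""

    def __init__(self):
        # object id -> {"owner/name": value}
        self._tags = defaultdict(dict)
        # namespace owner -> set of other users allowed to write there
        self._writers = defaultdict(set)

    def grant(self, owner, user):
        """Let `user` write tags in `owner`'s namespace."""
        self._writers[owner].add(user)

    def tag(self, user, obj, path, value):
        """Attach `value` to `obj` under `path` ("owner/name") on behalf of `user`."""
        owner = path.split("/", 1)[0]
        if user != owner and user not in self._writers[owner]:
            raise PermissionError(f"{user} may not write to {path}")
        self._tags[obj][path] = value

    def get(self, obj):
        """Return a copy of all tags on `obj`; reading is open to everyone."""
        return dict(self._tags[obj])


store = TagStore()
# The author publishes metadata under their own namespace.
store.tag("alice", "dataset-42", "alice/description", "example description")
# Any reader can annotate the same object under their own namespace.
store.tag("bob", "dataset-42", "bob/cited-in", "example citation")
# A reader cannot overwrite the author's tags without permission.
try:
    store.tag("bob", "dataset-42", "alice/description", "overwritten")
except PermissionError as err:
    print(err)
```

In this model the object itself has no owner, so any user can add annotations to it, while each annotation stays protected by the namespace it was written in. That is one way to reconcile open annotation with the guarantee that an author's data is not altered by others.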

Similar resources

Towards Exascale Scientific Metadata Management

Advances in technology and computing hardware are enabling scientists from all areas of science to produce massive amounts of data using large-scale simulations or observational facilities. In this era of data deluge, effective coordination between the data production and the analysis phases hinges on the availability of metadata that describe the scientific datasets. Existing workflow engines ...

Metadata for multilingual content management - a practical experience with the SARE-Bi system

Introduction This paper describes a multilingual document managing system, SARE-Bi, that is based on the use of metadata. In this system, metadata have the role of controlling all phases of a document’s life cycle, from the drafting of the first version up to the reutilization of published material, including all intermediate phases of translation, post-edition, validation, publication, and oth...

Sparse Cross-Products of Metadata in Scientific Simulation Management

Managing scientific data is by no means a trivial task even in a single site environment with a small number of researchers involved. We discuss some issues concerned with posing well-specified experiments in terms of parameters or instrument settings and the metadata framework that arises from doing so. We are particularly interested in parallel computer simulation experiments, where very larg...

Design and Implementation of a Comprehensive Database of the Written Heritage of Science and Technology

Purpose: This study aims to design and implement a comprehensive database of the written heritage of science and technology in the Regional Information Center for Science and Technology (RICeST) and determine the metadata elements required to describe the manuscripts. Method: This study was carried out by the content analysis method to identify the metadata elements needed to describe the coll...

The instantiation of OmniPaper RDF prototype in the context of scientific publications

Purpose of this paper The purpose of this paper is to present an instance of the system developed in the OmniPaper project, regarding the mechanisms of distributed information retrieval. These mechanisms were developed for newspapers’ articles and they were then instantiated in the context of the scientific publication. Another goal concerns the use of a central metadatabase developed to accomp...

Maitri: Format Independent Data Management for Scientific Data

Today’s scientific applications are very data intensive, and their data management requirements can no longer be met by special-purpose libraries for particular scientific data formats or by traditional database management systems. This paper proposes Maitri, a data-format-independent, loosely-coupled, application-tailorable set of libraries that provides a holistic data management framework fo...

Journal:
  • CoRR

Volume: abs/1203.4135

Publication date: 2012